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Application to Forces 
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"oa ! Abstract 

A simple and stable method for computing accurate expectation values of observable 
with Variational Monte Carlo (VMC) or Diffusion Monte Carlo (DMC) algorithms is pre- 

IT) 

■ sented. The basic idea consists in replacing the usual "bare" estimator associated with 
O ' 

the observable by an improved or "renormalized" estimator. Using this estimator more 

CO 

accurate averages are obtained: Not only the statistical fluctuations are reduced but also 
O ! the systematic error (bias) associated with the approximate VMC or (fixed-node) DMC 

• i-H 

probability densities. It is shown that improved estimators obey a Zero- Variance Zero- 
Bias (ZVZB) property similar to the usual Zero- Variance Zero-Bias property of the energy 
with the local energy as improved estimator. Using this property improved estimators can 
be optimized and the resulting accuracy on expectation values may reach the remarkable 
accuracy obtained for total energies. As an important example, we present the application 
of our formalism to the computation of forces in molecular systems. Calculations of the 
entire force curve of the H2,LiH, and LI2 molecules are presented. Spectroscopic constants 
R e (equilibrium distance) and u e (harmonic frequency) are also computed. The equilib- 
rium distances are obtained with a relative error smaller than 1%, while the harmonic 
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frequencies are computed with an error of about 10%. 
PACS numbers: 71.15.-m, 31.10,+z, 31.25.Nj, 02.70.Lq 
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I. INTRODUCTION 



Over the recent years quantum Monte Carlo (QMC) methods have become more and 
more successful in computing ground-state total energies of molecular systems. For sys- 
tems with large number of electrons the accuracy obtained by QMC is very good. As 
illustrated by a number of recent calculations, [1-14] the quality of the results is com- 
parable and, in most cases, superior to that obtained with more traditional techniques 
(DFT, MCSCF or coupled cluster methods). Unfortunately, for properties other than 
energy the situation is much less favorable and accurate results are difficult to obtain. 
To understand this point let us first define what we mean here by accuracy. In standard 
quantum Monte Carlo schemes there exist essentially two types of error: 

i) the usual statistical error resulting from the necessarily finite simulation time. This 
error present in any Monte Carlo scheme behaves as ~ 1/ \fN where N is the number of 
Monte Carlo steps. 

ii) the systematic error (or "bias" ) associated with some particular choice of the trial 
wave function. In a Variational Monte Carlo (VMC) scheme it is the systematic error 
resulting from the approximate trial probability density. In a fixed-node Diffusion Monte 
Carlo (DMC) it is either the fixed-node error of energy calculations or the systematic 
error associated with the mixed DMC probability density for a more general observable. 
Other types of systematic errors may also exist, e.g. the short-time error, [15] however, 
such errors can be easily controlled and, therefore, will not be considered here. 

Now, to enlighten the major differences between energy and observable computations 
let us compute the expressions of these two errors. We shall do that within the framework 
of the Variational Monte Carlo method where, as we shall see later, all the main aspects 
of this work are already present. 

In a variational Monte Carlo simulation the variational energy 

v ~ <iMV*> ' [ ) 
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where ipr is the approximate trial wave function used, is re-expressed as the statistical 
average of the local energy defined as 

over the probability density associated with ^f,, namely 

E v = (E L )^2. (3) 

An accurate calculation of the energy requires the two following conditions. 

(i) First, the systematic (or variational) error defined as 

A E = E v - E > 0, (4) 

where E is the exact energy, must be as small as possible. 

(ii) Second, the variance of the local energy (which is directly related to the magnitude 
of the statistical error) 

a 2 (E L ) = ((E L -E v )%, T , (5) 

must also be as small as possible. 

To estimate both quantities we express them in terms of the trial wave function error, 
Sip — i^t — ipOi where tpo is the exact wave function. Regarding the systematic error it is 
easy to check that 

{tpT - ipo\H - E \ip T - ip ) 



(iPt\^t) 

In other words, A# is of order two in the wave function error 



(6) 



A e ^O[5^j 2 ]. (7) 
Now, regarding the variance, it is convenient to write the following equality 



from which it is directly seen that a 2 (E L ) is also of order two 

a\E L ) ~ 0[5^\. (9) 

Equations (7) and (9) are at the origin of the high-quality calculations of the en- 
ergy. They show that accurate energy calculations are directly related to good trial wave 
functions: The more accurate the trial wave function is, the smaller the statistical and 
systematic errors are. In the limit of an exact trial wave function, both errors vanish 
and the energy estimator reduces to the exact energy. This most fundamental property 
is referred to in the literature as the "Zero- Variance property" . Note that a much more 
preferable and accurate denomination should be "Zero- Variance-Zero-Bias property" to 
emphasize on the existence of the two types of error. Of course, in the case of the energy 
this distinction is not necessary since, as just seen, the two errors are not independent 
and vanish simultaneously with the exact wave function. However, as we shall see below, 
this peculiar aspect will be no longer true for other properties. 

Let us now turn our attention to the computation of a general observable. Defining 
the expectation value of some arbitrary observable O as follows 

o = (10) 

V ~ (1*11*) ' ^ } 

its Monte Carlo expression is given by 

o v = (oy r . (11) 

It is easy to verify that the systematic error behaves as 

Ao = (0)^- <0)^~0[<ty], (12) 
while the variance is given by 

<r 2 (0)~0[l]. (13) 

Compared to the energy case we have two striking differences. First, the systematic error 
in the averages is much larger. This is a direct consequence of Eq.(12): the estimator of 



a general observable has only a linear zero-bias property instead of a quadratic one like 
in the energy case. Even worse, because trial wave functions are optimized to lower the 
systematic error in the energy (and/or its fluctuations) and not the error in the observ- 
able, the prefactor associated with the linear error contribution, Eq.(12), is usually much 
larger than in the energy case, Eq.(7). In practice, this important systematic error makes 
in general the quality of the expectation value, Eq.(ll), very poor. The second important 
difference is that there is no zero- variance property at all for observables when Eq.(ll) is 
used. Indeed, even when the exact wave function is used as trial wave function we are still 
left with some finite (and eventually large) statistical fluctuations, Eq.(13). Thus, statis- 
tical fluctuations are in general very large for properties. A simple and popular strategy 
to reduce the important systematic error on properties is to mix Variational Monte Carlo 
(VMC) and fixed-node Diffusion Monte Carlo (DMC) calculations to build up a so-called 
"hybrid" or "second-order" estimator, (0) hybrid = 2(0} DMC — (0) VMC , whose error is re- 
duced. [16] An elementary calculation shows that the error is now of order 0[(ipT — ^o) 2 ]; 
plus a linear contribution 0(ipQ N — ipo) due to the approximate nodes of the trial wave 
function. However, once again such a solution is not, in practice, as satisfactory as it 
appears at first glance because of the large prefactor associated with the second-order 
contribution and, also, of the non-negligible linear error due to the nodes. A second pos- 
sible strategy to cope with the systematic error is to perform an "exact" QMC calculation 
based on one of the variants of the so-called "Forward Walking" scheme. [17-19] Unfor- 
tunately, such schemes are known to be intrinsically unstable and, therefore, very time 
consuming. In practice, the possibility of getting or not a satisfactory answer depends 
very much on the accuracy required and on the type of observable considered. Therefore, 
Forward Walking is not considered as a general practical solution to the problem. 

In this work, we propose to follow a quite different route. Our purpose is to show 
that it is possible to use much more efficient estimators for properties than the usual bare 
expression, Eq.(ll). More precisely, it is shown how to construct in a simple and system- 
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atic way new estimators having the same remarkable quadratic zero-variance zero-bias 
property as the energy case. Very recently, we have made a first step in that direction 
by showing how to generalize the zero-variance part of this property. [20,21] In short, the 
basic idea consists in constructing a "renormalized" or improved observable having the 
same average as the original one but a lower variance. To build the renormalized observ- 
able, an auxiliary wave function is introduced. This function plays a role analogous to the 
one played by the trial wave function in the case of the energy: The closer the auxiliary 
function is of the exact solution of some zero-variance equation (the Schroedinger equa- 
tion in the case of the energy), the smaller the statistical fluctuations of the renormalized 
observable are. Our approach has been illustrated on some simple academic examples [20] 
and also for the much more difficult case of the computation of forces for some diatomic 
molecules. [21] Numerical results on these examples are very satisfactory. When suitably 
chosen auxiliary functions are used, statistical errors are indeed greatly reduced. 

Here, we present the full generalization of the preceding idea: it is shown how to 
construct improved observables minimizing both systematic and statistical errors with 
a quadratic behavior similar to that obtained for the energy. As a consequence, any 
observable is expected to be calculated, at least in principle, with the remarkable accuracy 
achieved by QMC for total energies. The basic idea behind our approach is quite simple: 
it consists in making use of the relation between energy and observable calculations as 
expressed by the Hellmann-Feynman (HF) theorem. As well-known this theorem expresses 
any quantum average as a total energy derivative with respect to the magnitude of the 
external potential defined by the observable. It is shown how the zero-variance zero- 
bias principle valid for each value of the energy (as a function of the external potential) 
can be extended to the derivative and, therefore, to the observable. Note that in the 
context of QMC simulations, the idea of using the HF theorem to compute observables, 
using either a finite difference scheme or the analytic derivative, is not new and has 
been applied by several groups [22-30]. In general, the results are good for very small 
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systems but rapidly disappointing for larger systems. Indeed, only when a clear physical 
insight into the origin of the fluctuations of the infinitesimal difference of energy (the 
derivative) is available it is possible to propose an efficient solution to the problem. A 
very nice example of such a possibility is presented in the recent work by Filippi and 
Umrigar. [31], [29]. By using a finite representation of the energy derivative and by 
introducing a special coordinate transformation allowing the electrons close to a given 
nucleus to move almost rigidly with that nucleus, they have shown how to correlate 
efficiently the calculation of the electronic energies associated with two slighlty different 
nuclear configurations of a diatomic molecule. As a result they have been able to get 
accurate estimates of the energy derivatives (forces) for some diatomic molecules. Here, 
we show how the correlated sampling method of Filippi and Umrigar can be re-expressed 
in our framework. In addition, by generalizing their idea it is shown how coordinate 
transformations can be used to define a new class of improved estimators. 

While finishing this work, we came aware of a paper just published by Casalegno, 
Mella and Rappe. [32] The idea underlying their work has some close relations with what 
is presented here. In short, they propose, as we do here, to compute forces using a 
Hellmann-Feynman-type formalism. Their expression to calculate forces is obtained by 
making the derivative of the VMC (or DMC) energy average with respect to nuclear 
positions. To reduce the systematic error these authors propose to employ trial wave 
functions which have been very carefully optimized via energy minimization (let us recall 
that the HF theorem is valid when fully optimized wave functions are used). To decrease 
the very large statistical fluctuations associated with the infinite variance, the improved 
estimator introduced in our previous work on forces [21] is used. As we shall see below, 
the approach proposed by Casalegno and collaborators can be viewed as a special case 
of the general method presented here, except that their estimator does not obey a Zero- 
Variance Zero-Bias property. As we shall see below, this latter aspect has some important 
practical consequences when a high level of accuracy on forces is needed. 
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The organization of the paper is as follows. In Section II we present the Hellmann- 
Feynman theorem and the construction of improved estimators for variational Monte 
Carlo calculations. It is also shown how the idea of Filippi and Umrigar consisting in 
introducing a special coordinate transformation can be used to build up some more general 
and more efficient improved estimators. In Section III we discuss the generalization of 
the formulae to the case of Diffusion Monte Carlo calculations. In Section IV we present 
the application of the formalism to the computation of the entire force curve for the 
H 2 ,LiH, and Li 2 molecules. Calculations of the spectroscopic constants, R e and u e , are 
also reported. Finally, in the last section we summarize our results and present some 
concluding remarks. 



In order to make the connection between energy and observable computations we shall 
make use of the Hellmann-Feynman (HF) theorem which expresses the expectation value 
of an observable as an energy derivative 



where E (X) is the exact ground-state energy of the "perturbed" Hamiltonian defined as 



By choosing various approximate expressions for the exact energy in Eq.(14), it is possible 
to derive various approximate estimates for the average. In the next sections we present 
two choices which turn out to be particularly efficient in practical applications. 

A. Improved estimator built from the variational approximation of the energy 



II. IMPROVED ESTIMATORS FOR OBSERVABLES 



(Wo) dE (X) 



A=0> 



(14) 



if (A) = H + AO. 



(15) 



A most natural choice consists in replacing the exact energy of the HF theorem by 
a high-quality variational approximation. To do that, we introduce some A-dependent 
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approximate trial wave function, ^r(A) to describe the ground-state of H(X) [Note that, 
for the sake of clarity and simplicity, we shall denote in what follows V't(O), H(0), and 
Eq(0) as ipT, H, and £o, respectively]. 

The exact average of the observable can be decomposed as 

(V>o|Wo) dE v (X) , 

where E V {X) is the variational energy associated with i/jt{X) 

E V (X) = (E l (X)), tHx) = ( H( ^ X) W W (17) 

and e some correction depending on 5ip = i/jq — ipx and its derivative, and vanishing when 
the exact wave function is used as trial wave function. 

Now, the important point is that the derivative of the variational energy, dE ^ | A=0 , 
is expected to be a better estimate of the exact average than the ordinary average of the 
bare estimator, Eq.(ll), when properly chosen A-dependent trial wave functions are used. 
This is true since the standard estimator, Eq.(ll), can be re-expressed as a particular 
case of the derivative of the variational energy for a A- independent trial wave function, a 
choice which is clearly not optimal. Before justifying more quantitatively this statement, 
let us rewrite the derivative as an ordinary average over the density tpr 2 - This can be 
easily done, it gives 

dE v (X). 



IA=0 



(ow ( 18 ) 



dX 

where O is a new modified local operator written as 

6=0 + (H -? lW + 2(E L -E»)'^ (19) 

Wt Wt 

In this latter formula, and in the formulae to follow, we shall use the following simplified 
notation 
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where /(A) is some arbitrary function of A. 

Now, we have to justify the first important result, that the new estimator O is a better 
estimator for the exact average than the bare observable O. For that purpose, we compute 
the systematic error in the corresponding average and the variance of the new operator. 
Regarding the systematic error we can write 

A d = \°)^t 2 - = ^ Ia=o- ( 21 ) 

Let us denote tpo(\) the exact groundstate of H(X) [with ipo(0) = ipo\. Using the 
equality 

E V (X)-E (X) = 
(^r(A) - M*)\H(\) ~ Eo(\)\M*) ~ ^o(A)) 



(^t(A)|Vt(A)) 
and choosing the following convention of normalization 



(22) 



<^r(A)|^r(A)> = l, (23) 
the derivative can be easily computed, we get 

A 6 = ($r -ipo\0- (O)^gl^r - il>o) 

+ 2{i> T -MH- E \ijj T ' - ip ') (24) 

As it can be seen, the systematic error is now of order two in the errors ipT — ipo and 

Ao^O^-Vo)^'-^')]- (25) 
Now, let us compute the variance defined as 

( r 2 (0) = ((0-(0)^ 2 )V 2 - ( 26 ) 

Using Eqs. (18), (19) and the fact that E' L = O + ^ H ~^^ T we can express the difference 
O — (0)^ T 2 as follows 
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6 - (0)^ 2 = E L > - E v ' + 2(E L - E v )^-. (27) 

For the sake of clarity, let us distinguish two different contributions in the difference. The 
first contribution is given by 

d[E L (X) - E V (X)} 

h, L — h, v — — | A=0 • l^oj 

Using expression (8) for [-El (A) — E V (X)] and performing the derivative one obtains 

&l — &v — ; 

Wt 

| {H- Eq) {ipT - %) (H - E )(^t - ^o) th' 



+ (0)^ - (O)^ (29) 

This latter expression is clearly of order one in ip T — ip and its derivative,^' — V'o- The 
second contribution in the R.H.S. of Eq.(27) is proportional to El — E v . We have already 
seen that it is of order one in i\) T — *0 O , Eqs.(7), (8). Finally, O — (0)^2 is found to be of 
order one in ipT — V'o and tpx — V'o- The variance, Eq.(26), is therefore of order two 

a 2 {0) ~ 0[{frr - 4> q )(4>t' - 4>o% (30) 

To summarize, using the HF theorem we are able to construct an improved observable 
O, Eq.(19), having a quadratic zero- variance-zero-bias property, Eqs. (25,30), similar to 
what is known for the energy case, Eqs. (7,9). The improved estimator O depends only 
on one single quantity, namely ^t(A). Accordingly, to get accurate results we need to 
choose in the neighborhood of A = a trial function accurate enough to get not only 
a small difference in wave functions but also in the derivative of the wave functions. In 
practice, this latter point is particularly difficult to fulfill. Indeed, at fixed values of A, 
it is known that the minimization of the fluctuations of the local energy can allow an 
important reduction of the error in the trial wave function. However, there is no reason 
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why it should also lead to a satisfactory representation of the derivative of the trial wave 
function. 

In order to escape from this difficulty we propose here to work directly at A = and to 
optimize independently the trial wave function ipT and its derivative ipr ■ Such procedure 
is justified since it corresponds to choose as A-dependent trial wave function the following 
expression 

<MA) =^ T + A^, (31) 

where ip is some new independent function playing the role of a trial function for the 
derivative of the ground-state at A = 0. In this case, the renormalized observable can be 
rewritten under the final form 

0^ 0+ (^M + 2( Bi -£jf. (32) 

where the pair of functions (ipT,^) is the current guess for the exact solution (ip ,ip '). 

Let us now turn our attention on the problem of optimizing the two trial functions 
(ipT,^)- Regarding ip T we know that the standard procedure consists in minimizing the 
variance of the local energy with respect to the parameters of the trial function. Quite 
remarkably, we have here a similar result for %p: the best choice is obtained by minimizing 
the variance of the renormalized operator O with respect to the parameters of if). 

To prove this property it is sufficient to show that the zero-variance (or zero- 
fluctuations) equations for El and O: 

E L = (E l )^ t 

6 = (O)^ (33) 
are equivalent to the equations defining i/j and ip' Q , namely 

(H - E )^j = 

(H - £ )Vo + (0~ (C%2)Vo = (34) 
13 



In these formulae, the first equation is just the ordinary Schroedinger equation. The 
second one is obtained by deriving the Schroedinger equation: 

# (A)V(A) = £ (A)V(A) (35) 

with respect to A at A = 0. Note that equations (34) determine an unique solution, 
(■00, i>b, Eo, (0)^2), as soon as H has a non- degenerate ground-state. Now, using Eqs.(19) 
and (2) for the definition of O and E L) respectively, the system of equations (33) can be 
rewritten under the form 

(H - E v )ij T = (36) 
(H - E v )fa' + (O - (0)^ T = (37) 

which are nothing but Eqs.(34) with (i^t^t') = (V'cbV'd)- Accordingly, the zero- variance 
equations (33) admits this latter pair of functions as unique solution. 

In practical calculations, different strategies of optimization can be employed. A first 
approach consists in minimizing separately the variance of the local energy with respect 
to the wave function tpr and the variance of O with respect to ip. In this way, we 
get an optimal trial wave function ip T for the energy and the best derivative at fixed 
ip T . However, let us emphasize that this approach is not the most general: we can also 
minimize both variances simultaneously with respect to the two independent functions. 
Another remark is that the second equation of system (34) can be viewed as an ordinary 
first-order perturbation equation. This is expected since, when AO is considered as a 
perturbation of the Hamiltonian H, ifj' is nothing but the first-order correction to the 
ground-state and {0)^2 the first-order correction to the energy. 

Finally, let us end this section by commenting in more detail the various terms en- 
tering expression (32) of the improved operator. Three different contributions can be 
distinguished: 

(i) The ordinary bare estimator O corresponding to i/j — 0. 
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(ii) A second contribution given by (if — Eijip/ipT- It is easy to verify that this 
contribution has a zero average over the density ipT 2 

((H-E L )*p/^ T )^ = 0. (38) 

Accordingly, its role is to lower the variance of the improved estimator without changing 
the average of the observable (no influence on the systematic error). Note that for appli- 
cations where the stationary density is known and can be exactly sampled (that is, there 
is no systematic error in the average) the use of contributions (i) and (ii) is sufficient. Im- 
portant examples include all "classical" Monte Carlo simulations based on the Metropolis 
algorithm or one of its variants. Such a possibility was the subject of a previous work. [20] 
(iii) A third term given by 2(E L — E v )^. This contribution has a very small impact on 
the statistical fluctuations since the variance of (El — E v ) is of order two in the trial wave 
function error for any choice of tp. Its main effect is to take into account the change of 
stationary density under the external perturbation defined by the observable and, there- 
fore, to lower the systematic error in the expectation value of the observable. Note that 
in the limit ipT = ?Po, this contribution reduces to zero and, therefore, the average of this 
term can be understood as a correction to the Hellmann-Feynman formula when ipT is not 
the exact ground-state (note that similar corrections to the HF formula exist also in more 
traditional ab initio calculations, e.g. the "Pulay force" [33] resulting from approximate 
Hartree-Fock (or LDA) orbitals in self-consistent schemes). 

B. More improved estimators: use of coordinate transformations 

In this section it is shown how to generalize further our renormalized operators. The 
basic idea of the generalization is based on an original idea recently proposed by Filippi 
and Umrigar in their work on the computation of forces [29] . Working in a finite difference 
formalism they have proposed to compute the forces as a small but finite difference of 
energies for two close enough geometries. In order to minimize the fluctuations they have 
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proposed to use a correlated sampling method in which a common Monte Carlo density 
(the so-called primary one) is used for the two close geometries. Written within our 
notations and taking the limit of the two geometries infinitely close (5R — > is equivalent 
to A — > 0) it means that the variational energy is written under the form 

= * (39) 

where V't(A) is the trial wave function chosen for a parameter A and ip? is the reference 
(primary) trial wave function. 

The price to pay when doing that is the introduction of some additional fluctuations 
associated with the weight tnW _ The remedy they propose to deal with this problem is 
to use a specific coordinate transformation (space-warp transformation) based on physical 
motivations: The transformation is built so that the electrons close to a given nucleus 
move almost rigidly with that nucleus when the geometry is changed. Here, we generalize 
this idea: coordinate transformation can help to minimize the relative fluctuations when 
varying the external parameter A. As a physical consequence, estimators built from the 
derivative are expected to have smaller fluctuations and smaller systematic errors. 

Let us write a general coordinate transformation as follows 

V = T(\,x) (40) 

where the vector x (or y) denotes the set of the 3n e i ec electronic coordinates. Using this 
transformation the variational energy at a given A can be written as 

(^[A,nA,g)]J(A,f)^|g)l), T2 

vi ) = (j(Xx) ^T^ ) ~ ( ) 

where J(A, x) is the Jacobian of the transformation. Introducing the vector field v such 
that at first order in A we have: 

f(X,x) =x + Xv(x) + 0(X 2 ) (42) 
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we can compute the derivative of the variational energy with respect to A at A = 0. After 
some simple but tedious algebra we get the following equality 

where O is a new renormalized operator given by 

6s0+ C- + 2{El _ K M + gKgi -f . (44) 

To derive this expression we have used the fact that the Jacobian defined as 

J(\,x) = det[^^} (45) 
has the following small-A expression 

J(A, f) = def ft,- + A^l + 0(A 2 ) (46) 

and, therefore, 

J(0, f ) = 1 (47) 

^(0,5)=V.i; (48) 

This more general operator is identical to the operator derived in the previous section 
plus a new contribution resulting from the derivative of the coordinate transformation. 
This new term has a zero average over the VMC distribution Accordingly, its main 
role is to reduce further the statistical error. However, it is important to emphasize that, 
when the trial function ip and the vector field v are optimized simultaneously it has also 
an influence on the magnitude of the systematic error. 



III. BEYOND VARIATIONAL MONTE CARLO 



In the preceding section we have shown how to construct improved observables, O, 
associated with accurate expectation values: 
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When the error = ^ — V'o' i n the trial function for the derivative is comparable to the 
error in the trial function for the ground-state, 5ip = ipT — i>o, the accuracy reached with 
the preceding variational estimate, Eq.(49), can be comparable to the very good accuracy 
usually obtained for total energies. However, despite this remarkable improvment, we 
are still left with some small residual systematic error associated with approximate ipx 
and ip. In the energy case it is known that this error can be entirely suppressed (at 
least for systems with no nodes or known nodes) by averaging the local energy over the 
mixed Diffusion Monte Carlo (DMC) probability distribution, n DMC ~ WtWq instead 
of the VMC distribution, tivmc ~ Wt 2 ■ Unfortunately, we have no such result for the 
improved observables defined here. However, as we shall see now, we can still define some 
approximate way for recovering most of the error. 

A natural way of defining an exact extimator for the observable is to consider the 
derivative of the exact DMC energy estimator instead of the VMC one 

E (X) = ( j Bl(A))v> t (a)Vo(a) = ( ^"jjj ~ )vt(a)^o(a) (50) 

Making the derivative and rewritting the result as an ordinary average we get: 

dE (X ) i 
dX 

where O is written as 



I A=0 



(0) M , (51) 



asO+ ^-f 1 ^' + (E L -E )(^ + f). (52) 

Of course, written under the above form, this exact estimator is useless since the exact 
wave function is not known. Here, we propose to make the following natural approxima- 
tion 

£ = *L. (53 ) 
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Therefore, our final approximate DMC estimator is written as 

DMC = 0+ {E - El) ^ + 2 {E L - E )^- (54) 

where ip is as usual our trial function for the derivative of the exact wave function. Note 
that this estimator is very similar to the VMC one, Eq.(32). The only difference lies in the 
value of the average energy, E = (El), entering the definition of Odmc- More precisely, 
we have 

Odmc ~ VMC = 2(E V - E Q )^-. (55) 

Now, in order to reduce further the error let us show that we can generalize the usual 
"hybrid formula" (0) hybnd = 2(0) 

dmc ~ (O)vmc defined for bare observables to the case 
of improved observables. To do that, let us develop the quantity (5%1)\Odmc\$iP) where 

(Sip\d DM c\Sip) = (^t\O DM c\^t) - 2(^ T \d DM c\^o) + (^o\0 D mc\^o) (56) 
which leads to 

2(^t|O dmc # > - (^t|O dmc # t ) = (^oWo) +A + 0(# 2 ) 
where the intermediate quantity A is defined as 

Expanding A in terms of Sip = ipx ~ V'o we get 

A = 2(i, T \E L - E Q \$) - 2<<ty|(if - E L )$) - 4(5^\E L - E Q \$) + 0(# 2 ) 



Using now the equality 



we obtain 



1> 



T 



A = 0{5ifj 2 ) 
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This latter result shows that the error in the hybrid estimator is of order two in 5ip 

2(0dmc)v>tV'o — (Odmc)^ t = (MO^o) + 0(Sij 2 ), (57) 

thus generalizing the standard result for the bare observable. Note that we can use either 
Odmc or Ovmc [Eq.(32)] in this latter formula since the difference between the two 
renormalized operators is proportional to E v — E , Eq.(55), and, therefore, is also of order 
two in Sip [Eq.(7)]. 

When using coordinate transformation we have similar results. The exact DMC esti- 
mator is found to be 

= + {H - ElW + (E L - E )(tL + + YML ~ ^MM (58) 
Wt Wt Wo WtWo 

and we propose to use the following approximate form 

DMC B o + + 2(E L - { E L)) ± + - (59) 

Because the difference (E L — (E L )) is of order Sip it is easy to verify that the error in the 
hydrid estimator given by Eq.(57) remains here also of order two. 

Before ending this section let us emphasize that it is possible to write a closed com- 
putable expression for the exact estimator of the observable, Eq.(52), by expressing the 
unknown quantity as a computable stochastic average. Choosing a A-independent trial 
wave function ip T we can write [34,35,18] 

^o(A,x) = frW^ « e-Io dsE ^ x ' x ^ » x{0)=x (60) 

where x denotes an arbitrary point in configuration space and << ... » x ^ =x denotes the 
sum over all drifted random walks of length T starting at x as obtained in a Pure Diffusion 
Monte Carlo (PDMC) scheme (DMC without branching). [18] Of course, a similar formula 
can also be obtained in a DMC scheme. [17] Now, using formula (60) we get 

M f T ^«0[x(t)]e-Io dsE ^^» x{0)=x 



^= lim dt 1 v n ; T xm=x (61) 

^0 T^+oo7o <K e -J dsE L [XMs)] >>x(Q)=x 
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and, therefore, the exact estimator can be written in terms of a standard part plus a time- 
integral of the two-point correlation function between the local energy and the observable 



+ iim r dt <<(E L -{E L ))m]oime--''^>>^ ( 

T^+ooJ - J dsE L 

« e Jo » x (o)=x 



It is important to emphasize that this latter estimator is exact: averaged over the mixed 
DMC distribution it leads to an unbiased estimate of the exact average. However, the 
correlator can only obtained within a Forward Walking scheme and, therefore, the stability 
in time is not guaranteed. In this work, we shall not use this expression, its implementation 
will be presented in a forthcoming work. 

IV. APPLICATION TO FORCES 

The average force between atoms in a molecular system is defined as 

_ dE (q) 

F * = (63) 
where E (q) is the total electronic ground-state energy for a given nuclear configuration; 
q represents the 3N nud nuclear coordinates (N nud , number of nuclei) and ^ the particular 
force component in which we are interested. 
Defining the local force as follows 

where x represents the 3n e i ec electronic coordinates (n eiec , number of electrons) and V the 
total potential energy operator, and making use of the Hellmann-Feynman (HF) theorem 
the average force can be rewritten as the statistical average of the local force over the 
exact distribution V ; o( x ) : 

F w = (F w (x,q)>^ (x) . (65) 
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Written under this form the various proposals presented in the preceding sections can 
be applied to the calculation of the average force. It is important to emphasize that for 
approximate probability densities (VMC or DMC) the HF theorem is no longer valid and 
a systematic error in the statistical average (-F ?i (x, q)) is introduced. However, it is not a 
problem here since it is the purpose of this work to show that, by using suitable improved 
estimators, this error can be reduced and even suppressed in the zero-bias limit. 

In order to discuss the various aspects of the method we shall restrict ourselves to the 
case of diatomic molecules. Let us consider a diatomic molecule AB with atom A located 
at (R, 0, 0) and atom B located at the origin. The only non-zero component of the local 
force acting on the nucleus A is the x-component given by 



In this work we present a number of VMC and fixed-node DMC calculations for 
the diatomic molecules H2,LiH, and Li2. Implementation of the quantum Monte Carlo 
methods is well-known and will not be discussed here. For the H 2 molecule the trial wave 
function used has the following simple form 



where Ism is a ls-Slater function centered at nucleus M = A,B with exponent \i and c a 
parameter describing the amount of ionic contribution into the wave function. Of course, 
much more accurate trial wave functions can be constructed for H 2 . However, our purpose 
here is to show that such a simple form for ipT is already sufficient to get accurate values 
of the force. 

For LiH and Li 2 we have employed two types of trial wave function. Our main choice is 
standard in QMC calculations for molecules. The trial wave function is made of a determi- 
nant of single-particle orbitals multiplied by a Jastrow factor. The determinantal part is 
obtained from a RHF calculation and only the Jastrow factor is optimized. As we shall see 




(66) 



ip T = (ls A is B + ls B ls A ) + c{ls A ls A + ls B ls B ) 



(67) 
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below, we have also used Valence-Bond (VB)-type wave functions consisting of a number of 
determinants multiplied by a Jastrow factor. We have used such a multideterminantal de- 
scription to reproduce correctly the large interatomic distance regime (dissociation limit). 
In the case of LiH the determinantal part consists of three determinants corresponding 
to the covalent VB resonating structure: (ls£j) 2 [2s£jls.f/ + ls^2s^j] ({lsn, 2su-, op- 
timized atomic orbitals for the Li and H atoms, orbitals occupied by electrons a and (5 
antisymmetrized separately) and one ionic VB structure: {Isu^Ish^-Sh (lsz,i, Is./? opti- 
mized atomic orbitals for the Li + and H~ ions). In the case of Li 2 we have considered a 
six-determinant representation consisting of the three covalent VB structures describing 
the resonance between atomic orbitals (2s a, 2s b), (2p yA ,2p yB ), and (2p ZA ,2p Zg ). This 
latter trial wave function reproduces not only the dissociation limit but also a major part 
of the 2s-2p near-degeneracy. 

In Figures 1,2, and 3 the energy curves obtained for H 2 ,LiH, and Li 2 are presented. 
Upper curves are the VMC curves (open squares joined by a dotted line). For H 2 the 
two parameters c and \x have been optimized for each interatomic distance. For LiH and 
Li 2 the Jastrow-RHF wave function (one determinant) has been used. All the parameters 
entering the Jastrow factor have been optimized for all distances. Optimizations have been 
performed by minimizing the variance of the local energy using the correlated sampling 
method of Umrigar et al. [36]. The first important observation is that, except for H 2 , VMC 
curves are not smooth as a function of R. Such a result is not surprising: It is typical of a 
situation where an approximate trial wave function is optimized independently for different 
values of an external parameter (here, R) with respect to a large number of variables (for 
LiH and Li 2 we have used about 30 independent variational parameters). Depending on 
the initial conditions (which are themselves very dependent on R) the algorithm used for 
minimizing the variance can be trapped within one of the various local minima. As a 
consequence, the actual value obtained for the variance (and the corresponding energy) 
can vary abruptly even when the external parameter is changed smoothly. Of course, this 
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problem can be solved in principle by making very careful optimizations on very large 
samples. Indeed, the functional form of the trial wave function being identical at all 
distances a smooth curve must be got when the correct lowest minimum of the variance is 
obtained at each distance. Here, this is the case for H 2 whose trial wave function contains 
only two variational parameters. However, for large systems including a much larger 
number of variational parameters and nuclear degrees of freedom, the possibility of fully 
optimizing the trial wave function is just irrealistic. As an important consequence, let us 
emphasize that, in practice, there is no hope of obtaining meaningful forces by making 
straight finite differences of optimized variational energies without using some sort of 
correlated sampling scheme. This is a good illustration of how difficult the calculation of 
forces is within a QMC framework. 

Intermediate points (filled squares) are the DMC results obtained from fixed-node 
calculations using the optimized VMC trial wave functions. In sharp contrast with VMC, 
the DMC curves are now regular. This is so because, unlike VMC, fixed-node DMC 
averages do not depend on the particular form of the trial function used, except for 
the nodal structure. Here, the nodal hypersurfaces vary smoothly as a function of the 
distance and, therefore, the corresponding DMC energy curves are smooth within error 
bars. The second important observation is related to the global shape of the curves. 
Ideally, we are interested in having energy curves which differ from the exact curve only 
by a constant independent on the distance. This is indeed the condition to obtain accurate 
derivatives. Here, it is not the case for the VMC curves of LiH and Li 2 : the difference 
between the VMC and exact energy curves is an increasing function of the distance. It 
is not a surprise since in both cases the trial wavefunction is built from a single RHF 
determinant based on delocalized molecular orbitals which leads to a wrong description 
of the dissociation limit. However and, very interestingly, fixed-node DMC results have a 
much better behavior at large distances. As a consequence, one may expect at this stage 
to obtain accurate forces from the derivative of the fixed-node energy curve even when 
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relatively crude wave functions are used. Finally, let us note that the quality of the fixed- 
node calculations for the molecules considered here is quite good. To give an example, at 
the equilibrium distance of the Li 2 molecule, the total energy obtained is -14.9901(6) to be 
compared with the exact non-relativistic value of E =-14.9954. The amount of correlation 
energy recovered within the fixed-node approximation is about 95.7%. A similar quality 
is obtained for other distances and also for the LiH molecule. In the case of the nodeless 
H 2 molecule (no fixed-node approximation), the DMC energies agree perfectly well with 
the exact ones. For the H 2 molecule the variance of the local energy varies between 0.3 
at R — 0.8 and 0.02 at it! = 3.5; for LiH the variance is about 0.07, and for Li 2 it varies 
between 0.09 and 0.2. 

The crucial point when implementing the various formulae presented in the preceding 
section is the choice of the trial function ip for the derivative. In our previous study on 
forces [21] where we have focused our attention on the reduction of statistical fluctuations 
only, we have proposed to employ the minimal form leading to a finite variance of the 
renormalized local force. As can be viewed from Eq.(66), at short electron- nucleus distance 
r the local force behaves as F ~ 1/r 2 and, therefore, the variance (F 2 ) — (F) 2 is infinite. 
This well-known problem has been discussed in different places (See, e.g. [15] Chap. 8.2 
or [27]). Here, the "minimal" form removing the singular part responsible for the infinite 
variance is written as 



Vw(x) = Qip T (68) 



where Q is given by 



Q = ZaY, ^-£f (69) 

i=l \ l i J*- 1 

To see this, we just need to compute the following quantity 

^ = Z a L | r ._ R | 3 " V ^ ' WtA/V- (70) 
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By adding this latter quantity to the bare local force, Eq.(66), the singular part is exactly 
removed, the remaining contribution having a finite variance. In what follows, ip m i n will 
be referred to as the minimal form for ip. 

In Figure 4 we present various VMC calculations of the average force for the Li 2 
molecule as a function of the interatomic distance. A first set of points (filled squares 
points with very large error bars at R=5.,R=6.5, and R—7.5) are results obtained from 
the ordinary bare estimator, Eq.(66). Open squares (with small error bars) joined by 
the dashed curve correspond to results obtained by using ip m i n as trial function for the 
derivative 

FvMC-Zvbh, tmin] = F + (gjL^^L . ( 7 1) 

This estimator can be viewed as the simplest improved estimator we can think of having 
a finite variance; it corresponds to the form employed in our previous study. [21] The 
subscript ZV (Zero- Variance) is used here to emphasize that the improved estimator is 
built to decrease the statistical error only. Circles joined by a dotted line are results 
obtained from the ZVZB improved estimator derived in the preceding section, Eq.(32): 

Fvmc-zvzb^t, VW] = F + {H - EL) ^ mm + 2(E L - (E L ))^. (72) 

Note the use of the subscript ZV ZB to emphasize on the two aspects: reduction of statis- 
tical and systematic errors. Finally, the solid line represents the "exact" non-relativistic 
force curve for Li 2 . 

A first important observation is that using improved estimators is extremely efficient 
in reducing the statistical error. This can be seen by comparing the magnitude of the 
error bars on data obtained from the ordinary bare estimator (filled squares at R = 5., 6.5, 
and 7.5) with those corresponding to other calculations based on improved estimators. 
A reduction of at least two orders of magnitude is observed. As already discussed this 
remarkable result is a direct consequence of the fact that the infinite variance of the 
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bare estimator has been reduced to a finite value. At the scale of the figure error bars 
associated with improved estimators are almost not visible. A more quantitative analysis 
will be given later (see, Table II). 

Now, regarding systematic errors, results are much more disappointing. Using the 
pure Zero- Variance (ZV) renormalized estimator, Eq.(71), the behavior of the average 
force (open squares joined by the dashed line in Fig.4) as a function of R appears erratic. 
This can be easily understood since the term added to the bare force in Eq.(71) has a 
zero-average and, therefore, the erratic behavior is a direct consequence of the irregular 
VMC energy curve presented in Figure 3. When adding the term correcting the average, 
the results are improved. As seen on the figure the behavior of (FvMC-zvzB[i>T,i>min]), 
Eq.(72), as a function of R, is much less irregular, thus illustrating the important role 
played by the zero-bias additional contribution [third term of the R.H.S. of Eq.(72)] to 
correct the error due to the approximate trial wave function. Despite of that, the resulting 
curve is far from being satisfactory. To weaken the role played by ipr we can think of 
going beyond VMC calculations. In Figure 5 we present such calculations for Li 2 using 
the DMC-ZVZB improved estimator F DMC -zvzBH'T,i J min} written as 

F DM C-ZVZB^T, ilmin] =F+ { - - + 2(E L - (E L ))^ (73) 

where the energy average is a fixed-node DMC average, and, also, results obtained by 
using the generalized hybrid formula, Eq.(57) 

F ^ 2(F DM c-zvzB)i> T i>o — (Fvmc-zvzb)^ (74) 

A clear improvment is observed when going from VMC (open circles) to DMC (filled 
squares) and, then, to hybrid calculations (open squares): The systematic error present 
in VMC calculations is reduced. However, the resulting curves are still not satisfactory. 
Extracting from them a meaningful equilibrium distance or first derivative of the force 
curve (calculation of uj e ) is impossible. Very similar behaviors have been obtained for H 2 
and LiH. They do not need to be reproduced here. 
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The main reason for the poor results just presented is the low quality of the trial 
function ip m in used. According to our general presentation of Sec. II we know that a good 
trial function ip must be close to the derivative of the exact ground-state wave function 
with respect to R. Here, this is only true when an electron approaches the nucleus A (note 
that nucleus B has been fixed at the origin and, thus, has no pathological contribution). 
In that case the non-vanishing part of the exact wave function is expected to behave as 

V'o ~r i ^Rexp(-Z A |r i -R|) (75) 

which leads to 



dR r ^ R-ZM ]r,.-R! 



-Vo (76) 



which is nothing but (up to a minus sign) the minimal form for ip given above, Eqs. (68,69). 

In order to improve our trial wave function ip we propose to use the following finite- 
difference form 

7 MR + ar, p (r + ARJ\ - MR,p(R)] m 

WDeriv ~ • {< < ) 

In this expression p(R) denotes the complete set of variational parameters entering the 
trial wave function (coefficients of molecular orbitals, basis set exponents, Jastrow param- 
eters, etc.). The main advantage of using a finite-difference form instead of the exact 
derivative is practical: To estimate the derivative we only need to compute two additional 
local energies and, thus, we avoid deriving and programming the lengthy expressions re- 
sulting from the explicit derivative. Note also that using an approximate finite-difference 
representation is not a problem here: In any case ipDeriv must be considered as an ap- 
proximate trial function for the exact derivative and AR can always be interpreted as a 
new additional variational parameter for ip. In practical calculations, the complete set 
of parameters we use for minimizing the fluctuations of the various improved estimators 
consists of {p(R), p{R + AR), and AR}. 
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At the VMC level we consider the following form for the improved estimator 



FvMC-ZVZB \*pTi 4>Deriv, v\ 



F + 



(H — E L )ip Deriv 

4>T 



+ 2(E L -{E L )) 




(78) 



At the DMC level the expression used is very similar, see Eq.(59). In this expression the 
vector field v associated with the coordinate transformation is chosen as follows 



where u x is the unit vector along the x-axis. The vector field depends on two parameters 
a and (3, which are optimized to lower the variance of F V mc-zvzb- The vector field is 
built so that electrons close to the nucleus A translate with the nucleus, while electrons 
far away do not move. In practice, we compute the additional term associated with the 
coordinate transformation using a finite-difference scheme along the direction defined by 
the vector v 



(E L - (E L ))V.v+ [(E L - (E L ))(x + ev) YT ".^ 1 - (E L - (E L ))(x)]/e (80) 



where x represents the electronic coordinates and e a small positive quantity whose mag- 
nitude can also be optimized. 

In Figure 6 we present VMC calculations for Li 2 using FvMC-zvzB[ l l>T,' t f>min] and 
FvMC-zvzB[i J T,i J Deriv,v\ as improved estimators. The two estimators have been evalu- 
ated on the same Monte Carlo samples. There are two striking differences when using the 
second estimator FvMC-zvzB[ipT,ipDeriv,v\- First, the gain in statistical error is spectac- 
ular (about one order of magnitude, for a quantitative analysis see discussion below, Table 
II). Second, the curve is much more regular and closer to the exact result (solid line). 
The VMC results are very satisfactory at small distances (between R=3. and R=4.). 
However, at larger interatomic distances, the VMC curve begins to separate from the 



e -ar iA -l3r^ 



(79) 



i=l 



V[(E L -(E L ))r T v] 
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exact one. This is due to the fact that the wave function is built from a RHF calculation 
and, therefore, the dissociation limit is not correctly described. To address this problem 
we have considered a more sophisticated trial wave function consisting of a product of a 
Jastrow factor and a six-determinant one-particle part (for a more detailed description, 
see above). This VB-type trial wave function has been used only for the largest distances: 
R=7.,R=7.5,R=8, and R=8.5 In figures 7 and 8 the comparison between results obtained 
with the Jastrow-RHF (one determinant) and the Jastrow- VB (six determinants) wave 
functions is presented. At the VMC level (Fig. 7), the improvment resulting from the mul- 
tideterminant wave function is clearly seen, the forces computed are much closer to the 
exact curve than in the one-determinant case. At the DMC level (Fig. 8) we could expect 
that this error disappears even with the Jastrow-RHF (one determinant) wave function 
since the DMC results depend only on the nodal structure of the wave function. However, 
it is not true. The difference between the DMC curve and the exact one is still important 
at large distances like in the VMC case. This result takes its origin in the approximation 
made for the exact derivative of the wave function in the DMC estimator, Eq.(53) (^- is 
replaced by ^r)- When using the Jastrow- VB wave function the DMC results obtained 
are much better. 

We are now in a position to present our final curves for the three molecules ob- 
tained with our best fully-optimized estimator F Z vzB[i J T,i J Deriv,v\ and the hybrid for- 
mula. Results for the molecules H 2 ,LiH, and Li 2 are presented in Figures 9,10,11, re- 
spectively. As seen from these figures the overall agreement between the exact curves 
(solid lines) and QMC results (open squares) is very good. To be more quantita- 
tive we have extracted from these curves an estimate of the spectroscopic constants 
R e (equilibrium distance) and u e (harmonic frequency). To do that, the data have 
been fitted with a functional form given by the derivative of a Morse potential curve 
E(R) = _D[exp —2f3(R — R e ) — 2 exp —2f3(R — R e )\ over some interval of distances around 
the equilibrium geometry (R between 1.1 and 2. for H 2 , between 2.6 and 4. for LiH, and 
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between 4. and 6. for Li 2 ). Parameters D,(3, and R e have been determined via a gener- 
alized least-squares fit. Our results at the VMC, DMC, and Hybrid levels are presented 
in Table I and compared to experimental values. [37] As seen from the Table results for 
the equilibrium distances are excellent. The largest systematic errors are obtained at the 
VMC level (relative errors of 4. 3%, 3. 3%, and 5.7% for H 2 ,LiH, and Li 2 , respectively). 
A reduction of a factor of about two is gained when DMC calculatons are performed. 
Finally, using the hybrid formula, the exact equilibrium distances are recovered within 
statistical errors (the relative statistical errors being 1.1%, 0.3%, and 0.5% for H 2 ,LiH, and 
Li 2 , respectively). In contrast, results for the harmonic frequencies are less accurate but 
still satisfactory. For the H 2 molecule, the exact experimental result is almost recovered 
within statistical error at the VMC, DMC, and Hybrid levels, the relative statistical error 
being between 3 and 4%. For LiH and Li 2 the relative statistical errors are of the same 
order of magnitude. However, a non-negligible systematic error of about 10% is found for 
these molecules. This result illustrates that obtaining accurate harmonic frequencies is 
more difficult than obtaining accurate equilibrium geometries. 

Now, we would like to present a more quantitative discussion of the performance of 
the various force estimators introduced in this work. This will allow us to summarize 
the various aspects of the method, to present some comparisons with the recent results 
obtained from the improved estimator implicitly used by Casalegno and collaborators [32], 
and, also, to emphasize on some important quantitative issues. In Table II the systematic 
and statistical errors associated with the various force estimators at the VMC, DMC and 
Hybrid levels of calculation are presented. The results shown are for the Li 2 molecule 
at the equilibrium geometry (R=5.051 a.u.) where the exact force average, denoted as 
{F) ex , is equal to zero. To allow direct comparisons between force estimators all averages 
have been computed in a common Monte Carlo calculation (identical MC samples). To 
compare with, we also give the systematic and statistical errors on the total energy. To 
give a measure of the fluctuations of each estimator the corresponding variances at the 
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VMC level, <r 2 (VMC), are reported. To facilitate comparisons between data all averages 
(except for the VMC variances) are given with five significant figures after the decimal 
point and all statistical errors are given on the fifth decimal place (magnitude 1CT 5 ). 
The first estimator presented is the bare estimator, F [Eq.(66)]. As already pointed out, 
this estimator, which has an infinite variance, displays very large statistical fluctuations 
(between two and three orders of magnitude with respect to the improved estimators 
to follow) and is, therefore, not at all suitable for practical calculations. The second 
estimator, Fzv[i>T,' l Pmin] [Eq.(71)], introduced in our previous work on forces [21] is the 
simplest estimator having a finite variance. However, as explained above, when using such 
an estimator no control on the systematic error exists. The third estimator presented, 
FzvzB[i J T,i J min\, is the simplest estimator having the ZVZB property. We can see that 
the introduction of the contribution associated with the ZB property, 2(E L — {E L ))' ! ^ L 
is efficient in reducing the bias (the DMC and hybrid errors are roughly divided by a 
factor two). However, as already discussed, the derivative of the trial wave function is not 
correctly reproduced as a function of the interatomic distance and the corresponding force 
curve is not smooth (see, Figures 4,5). To get accurate and well-behaved (as a function of 
R) values of the force it is important to introduce an auxiliary function close to the exact 
derivative of the wave function. The most simple estimator based on this idea and having 
a finite variance can be constructed by using the minimal form ip min [Eqs.(68),(69)] for 
the Zero- Variance part and ^Dera>,[Eq.(77)], for the Zero-Bias part. Such an estimator is 
written as 

= F + (H - EL) ^ mm + 2(E L - (E L })$^ (81) 

Written with our notations this is in fact the estimator implicitly used by Casalegno 
and collaborators in their very recent work [32]. In contrast with the other estimators 
presented here this "mixed" estimator (different trial functions ip are used for the ZV 
and ZB parts) has no ZVZB property. Accordingly, our general optimization procedure 
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based on the minimization of the improved-estimator variance is no longer meaningful 
here. Nevertheless, as pointed out by Casalegno et ai, to optimize estimator (81) we still 
have the possibility of optimizing the parameters of the trial wave function ip T via energy 
minimization. Such a procedure is justified because fully-optimized trial wave functions 
are known to verify the Hellmann-Feynman theorem. Statistical errors associated with 
this estimator are reasonable and roughly similar to those obtained with the two previous 
estimators. Systematic errors are also comparable. However, in sharp contrast with all 
ZVZB-improved-estimators introduced in this work, the DMC (and hybrid) calculations 
do not improve the results and, as seen in the Table the hybrid results can be even 
bad, despite the fact that VMC results are reasonable. Regarding the dependence of the 
results on the interatomic distance we have been able to recover a relatively smooth force 
curve for the smallest molecules H 2 and LiH but not for Li 2 . In this latter case, the 
systematic error is found to be too much sensitive on the quality of the optimization of 
the trial wave function to lead to reliable results. The last improved estimator presented 
in Table II is our best proposal for the force estimator, Eq.(78). We report results with 
(v ^ 0) and without (v = 0) to enlighten the role of the coordinate-transformation term. 
As seen, the introduction of the v-tevm is extremely efficient in reducing the statistical 
error. For example, at the VMC level the statistical error without this term is 218. 1CT 5 , 
while the optimized improved estimator using the v-term is decreased down to 9.10 -5 . The 
reduction gained in statistical error is more than one order of magnitude. This remarkable 
result is general : It is valid for all molecules and all distances treated here. Another most 
important point is that our best improved estimator (78) is the only estimator presented 
in this work whose statistical error (here, 9.10" 5 ) is (much) smaller than the energy one 
(here, 32.10 -5 ). Note that it is also true for the systematic error (whatever the level 
of calculation). Such a result is particularly important since it has been found that a 
precise control of the magnitude of the systematic error through variance minimization 
of the improved force estimators is possible only when such a condition is verified. In 
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contrast, when the statistical error on the force is larger than the energy error, the variance 
minimization can lead to various results and to get a smooth force curve is very difficult. 
Actually, we would like to emphasize that obtaining results of the quality presented in 
Figure (11) for the Li 2 molecule has only been possible with the improved estimator (78). 
Using other estimators we have not been capable of constructing a reasonable force curve 
(smooth and accurate) for this molecule. 

Finally, let us say a word about the dependence of our results on the optimization 
process (determination of the optimal parameters entering vp, ipT, and v by minimization 
of the variance of the improved estimator). Clearly, the method presented in this work is 
useful only if the results obtained do not depend too much on the way the optimization is 
performed and on which particular minimum has been found for the variance (as already 
emphasized, when a large number of parameters are considered the location of such a 
minimum can depend very crucially on the initial conditions and/or on the random num- 
bers series used). To quantify this aspect we have made 9 independent optimizations over 
9 independent sets of 2000 walkers for the Li 2 molecule at the equilibrium geometry with 
our best estimator. Results show that the VMC average force results may vary in a sig- 
nificant way for the different sets of optimized parameters found. In this case, the domain 
of variation is about twenty times the magnitude of the statistical error and about 30% 
of the average itself. However, it has been observed that the DMC and Hybrid averages 
are much less sensitive on the optimized parameters. The error on the DMC and hybrid 
average forces due to an incomplete optimization has been found to be of the order of 
magnitude of the statistical error which is rather small. 



V. SUMMARY AND CONCLUSIONS 

In this work we have shown how to construct improved VMC or DMC estimators for 
observables. By improved it is meant that, compared to the standard bare estimator O, 
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the new estimators O have a lower variance and a reduced systematic error when averaged 
over the approximate VMC or (fixed-node) DMC probability densities. 

At the Variational Monte Carlo level the most general form we propose for O is given 

by 

where averages are defined over the VMC distribution. At the Diffusion Monte Carlo level 
the expression proposed is essentially similar, except that the average of the local energy 
entering the definition of DM c is defined over the DMC distribution, Eq.(59). The 
various terms defining the improved observables have a well-defined physical origin: The 
three first contributions result from the change of the energy average when the magnitude 
of the observable considered as an external field is varied, while the last contribution comes 
from the use of a coordinate transformation correlating electron displacements and change 
of the external field. The functions ipx and ip appearing in the improved observables play 
the role of trial functions: ipT is the ordinary trial wave function for the ground-state of 
H and is a guess for the derivative of the exact ground-state wave function, ipo(X), of 
the perturbed Hamiltonian 

H(X) =H + XO 

with respect to A at A = 0. When the trial functions are exact, {jpr, i>) = (fpo, d ^x^ L=o)> 
the improved estimator reduces to a constant, namely the exact average for the observable. 
In that case both statistical and systematic errors vanish. We have called this remarkable 
property "Zero- Variance Zero-Bias property" (ZVZB). In the neighborhood of the exact 
solution, a local expansion of the various quantities obtained from the approximate guess 
(ipTjip) can be done. It is found that there is a quadratic behavior in the errors Sip = 
ip T — ip and 5il)' — tjj — ip' Q . At the VMC level it reads 

a 2 (0) = ((6 - (O)^) 2 )^ ~ 0[W] (83) 
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and 

Ao = (O)^ - (OU* ~ 0[W], (84) 

with a similar result in the DMC case. This important result generalizes the well-known 
quadratic Zero- Variance Zero-Bias property of the energy where the local energy, E L = 
Hipx/ipT plays the role of the improved estimator: 

a 2 (E L ) ~ 0[5ip 2 } (85) 

and 

A El ~ 0[6i/> 2 ]. (86) 

In the case of the energy we can write a Zero- Variance Zero-Bias equation defining the 
optimal trial wave function by imposing that the local energy reduces to a constant, 
namely the exact energy 

^ = E . (87) 

Of course, this equation is nothing but the Schroedinger equation. Here, the Zero- Variance 
Zero-Bias equation for the observable is obtained by imposing that the improved observ- 
able O reduces to the exact average 

6 = (C%2. (88) 

By optimizing the three quantities (ifr, if), v) so that fluctuations of O are minimal we can 
obtain the optimal improved estimator for the observable. In practice, it is done in two 
steps. First, functional forms for the trial functions ipr and tp are chosen in order to re- 
produce the best as possible the exact solution of the zero- variance equations. The choice 
of the vector field v is done on physical grounds: It corresponds to the electron-coordinate 
transformation, y = x + \v[x) + 0(A 2 ), correlating as much as possible the electron dis- 
placements and the change of density associated with the external field defined by the 
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bare observable. Second, the various parameters entering the three quantities are opti- 
mized by minimizing the fluctuations of O over a large but finite number of configurations 
(typically, several thousands) drawn according to the VMC or DMC distributions. 

It is important to emphasize that by using the improved estimators presented here it 
is possible to get an accuracy on expectation values of observables which is comparable 
to the very good one obtained for total energies. As it can be seen from Eqs. (83,84,85,86) 
this is true when we are able to reduce the error on the derivative of the wave function 
at the level of the error on the wave function itself, that is Sip ~ 5ip' . 

Another fact worth pointing out is that there is not an unique way of constructing 
improved estimators. Here, we have built our estimators by considering the derivative of 
the variational, Eq.(18), or the exact DMC energy average, Eq.(51), We have also consid- 
ered the possibility of making a coordinate transformation before making the derivative, 
Eqs. (41,43). Of course, we can think of many other choices and/or transformations. Ul- 
timately, the better strategy will depend very much on the specific problem considered. 

Finally, in order to go beyond VMC or DMC calculations, we have shown that the 
reduction of error of one order in 5ip associated with the popular "hybrid" (or "second- 
order") formula mixing DMC and VMC averages can be generalized to the case of our 
improved estimators: 



As an important application we have applied our formalism to the case of the com- 
putation of forces for some simple diatomic molecules. In our preceding work on forces 
[21] we have focused our attention only on the zero- variance part of the problem. More 
precisely, we have employed i) a simplified version of the renormalized force, Eq.(71) ii.) 
the minimal expression for ip leading to a finite variance, Eqs. (68,69), and iii.) the hybrid 
formula mixing VMC and DMC calculations, Eq.(89). Results obtained for the vanishing 
force at the equilibrium distance for a number of small diatomic molecules were reasonably 




(89) 
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good. Here, we have illustrated that such a strategy is in fact not valid for describing the 
global shape of the force curve. It has been shown that results depend very much on the 
trial wave function used and, particularly, on the quality of the optimization process of the 
numerous parameters of the trial wave function. As a result, the force curves obtained 
are not regular as a function of the interatomic distance and important spectroscopic 
quantities such as the equilibrium distance R e and the harmonic frequency uj e cannot 
be obtained reliably. To get accurate curves we need not only to have a small amount 
of statistical fluctuations but also a control of the systematic error. By exploiting the 
general ZVZB principle presented in this work it has been shown that obtaining accurate 
curves is now possible. The basic ingredients are: i.) the use of a trial wave function for 
the derivative, ip, built as a finite-difference of the trial wave function with respect to the 
nuclear coordinate ii.) the use of a coordinate transformation in the spirit of the work of 
Filippi and Umrigar [29] and, finally, iii.) the systematic minimization of the variance of 
the improved estimator with respect to all the parameters entering the two trial functions 
(ipT, V0> an d the vector field, v, associated with the coordinate transformation. Let us 
emphasize that to get a well-balanced optimization of the two trial functions (leading to 
smooth curves for the forces) , it is essential to reduce the variance of the improved estima- 
tor for the force at the level of the variance of the local energy. Although such a condition 
may appear as very difficult to fulfill (local energies have usually very small variances), 
we have shown that it is in fact possible thanks to the coordinate-transformation term. 
Such a result is remarkable and is certainly one of most important practical aspect of the 
approach proposed in this work. 

Finally, let us remark that the price to pay with respect to the minimal scheme pre- 
sented in our previous work [21] lies on the need of computing about 3N nuc i local energies 
to calculate the various components of the force. However, we do not think it repre- 
sents a major difficulty for the realistic applications to come. Indeed, the few-percent 
accuracy needed on average forces will be obtained with relatively small statistics and, 
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therefore, it will not be necessary to compute the force vector at each Monte Carlo step 
(the expensive 37V nuc ;-local energy-calculation step will be done rarely). In addition to 
this, in applications where the nuclear geometry is varied during the simulation (Molecu- 
lar Dynamics-type applications) it should also be possible to use suitable re-actualization 
schemes to avoid re-computing entirely the 3N nuc i local energies for close nuclear config- 
urations. Of course, the validity of these various strategies as well as the quality of the 
improved estimators presented in this work need now to be checked for realistic applica- 
tions involving many nuclear degrees of freedom. 
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TABLES 

TABLE I. VMC, DMC, and Hybrid estimates of the equilibrium geometry R e (a.u.) and 
harmonic frequency u; e (cm -1 ). The atomic isotopic masses taken" are 1.007825035 amu for X H 
and 7.0160030 amu for 7 Li. 





H 2 


LiH 


Ll2 


R e (VMC) 


1.463(12) 


3.111(17) 


5.346(27) 


R e (DMC) 


1.426(13) 


3.056(6) 


5.200(16) 


R e (Hybrid) 


1.395(15) 


3.001(15) 


5.068(27) 


R e (Exp.) 6 


1.401 


3.015 


5.051 


io e (VMC) 


4194(130) 


1559(40) 


366(9) 


u e (DMC) 


4432(165) 


1549(22) 


373(5) 


uj e (Hybrid) 


4662(205) 


1519(31) 


387(8) 


(Ex P .) fc 


4395.2 


1405.65 


351.4 



Ref. [38] 
Ref. [37] 
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TABLE II. VMC, DMC, and Hybrid systematic (bias) and statistical errors for the total 
energy and various force estimators for Li2 at R=5.051 a.u. The VMC variances, a 2 (VMC), 
are also given. (Ei) ex and (F) ex denote the exact total energy, (£x) ex =-14.9954 a.u. and exact 
force (F) ex = 0. (equilibrium geometry), respectively. To facilitate comparisons between energy 
and force results all averages are given with five significant figures after the decimal point and 
statistical errors are given on the fifth decimal place (magnitude 1CT 5 ). Statistical errors on 
VMC variances are on the last digit. 



Estimator 


VMC Average 


DMC Average 


Hybrid 


cj 2 (VMC) 


E L - (E L ) ex 


0.03871(32) 


0.00531(50) 




0.113(5) 


a F - (F) ex 


0.18217(23216) 


0.15462(12293) 


0.12707(33185) 


+oo 


h F Z v\^T^min\ ~ {F) ex 


-0.06352(84) 


-0.04003(151) 


-0.01654(313) 


1.27(5) 


C F Z VZB[^T,tpmin] ~ (F) ex 


-0.05802(104) 


-0.02484(184) 


0.00834(382) 


1.3(2) 


d F[lp T ,'>Pmin\'<pDeriv] ~ (F) ex 


0.00619(109) 


0.02993(187) 


0.05367(390) 


2.8(1) 


e F Z VZB[^T,i>Deriv,V = 0] - (F) ex 


0.00871(218) 


0.00474(147) 


0.00077(366) 


14(3) 


e FzVZB[<pT,i>Deriv,v\ ~ (F) ex 


0.00692(9) 


0.00358(19) 


0.00024(39) 


0.016(1) 



a. Eq.(66) 

b. Eq.(71) 

c. Eq.(72) 

d. Eq.(81) 

e. Eq.(78) 
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FIGURE CAPTIONS 



Fig.l H 2 molecule. Variational Monte Carlo (VMC) energies (open squares), Diffu- 
sion Monte Carlo (DMC) energies (filled squares) and exact non-relativistic curve 
(solid line). The dotted line between VMC results is a simple linear interpolation 
to guide the eye. 

Fig. 2 LiH molecule. Variational Monte Carlo (VMC) energies (open squares), 
fixed-node Diffusion Monte Carlo (DMC) energies (filled squares), and exact non- 
relativistic curve (solid line). The dotted line between VMC results is a simple linear 
interpolation to guide the eye. 

Fig. 3 Li 2 molecule. Variational Monte Carlo (VMC) energies (open squares), 
fixed- Node Diffusion Monte Carlo (DMC) energies (filled squares), and exact non- 
relativistic curve (solid line). The dotted line between VMC results is a simple linear 
interpolation to guide the eye. 

Fig. 4 Various VMC average forces for Li 2 . Filled squares with large error bars: (F), 
Eq.(66). Open squares joined by the dashed line: {Fvmc-zvH't, "ipmin}) , Eq.(71); 
Circles joined with the dotted line: {FvMC-zvzBi^T^min}) , Eq.(72). Solid line: 
exact non-relativistic force curve. 

Fig. 5 Li 2 molecule. Average forces using F Z vzb{^t, V'mm),Eq.(72,73,74). VMC 
average: lowest curve with open circles. DMC average: intermediate curve with 
filled squares. Hybrid average: highest curve with open squares. Solid line: exact 
non-relativistic force curve. Dashed lines between QMC results are a simple linear 
interpolation to guide the eye. 

Fig. 6 VMC force for Li 2 . Lowest irreg- 

ular curve with filled squares: (FvMC-zv[i J T,i J min\) , Eq.(71). Upper curve with 
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open squares: {FvMC-zv[ l pT,'^Deriv,v\), Eq.(78). Solid line: exact non-relativistic 
force curve. 

Fig. 7 VMC force for Li 2 . Open squares: Average VMC forces from estimator (78) 
using the Jastrow-RHF one-determinant wave function. Open circles: Average VMC 
forces from estimator (78) using the Jastrow-VB six-determinant wave function. 
Solid line: exact non-relativistic force curve. 

Fig. 8 DMC force for Li 2 . Open squares: Average fixed-node DMC forces using 
the Jastrow-RHF one-determinant wave function. Open circles: Average fixed-node 
DMC forces using the Jastrow-VB six-determinant wave function. Solid line: exact 
non-relativistic force curve. 

Fig. 9 Hybrid force for H 2 . Solid line: exact non-relativistic force curve. 
Fig. 10 Hybrid force for LiH. Solid line: exact non-relativistic force curve. 
Fig. 11 Hybrid force for Li 2 . Solid line: exact non-relativistic force curve. 
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